Some Open Problems in Optimal AdaBoost and Decision Stumps
Authors
Abstract
The study of the theoretical and practical properties of AdaBoost is of unquestionable significance, given the algorithm's simplicity, wide practical use, and effectiveness on real-world datasets. Here we present a few open problems regarding the behavior of “Optimal AdaBoost,” a term coined by Rudin, Daubechies, and Schapire in 2004 to label the simple version of the standard AdaBoost algorithm in which the weak learner always outputs the weak classifier with the lowest weighted error in the hypothesis class of weak classifiers implicit in the weak learner. We concentrate on the standard, “vanilla” version of Optimal AdaBoost for binary classification that results from using an exponential-loss upper bound on the misclassification training error. We present two types of open problems. One deals with general weak hypotheses. The other deals with the particular case of decision stumps, as commonly used in practice. Answers to the open problems can have an immediate and significant impact on (1) cementing previously established results on the asymptotic convergence properties of Optimal AdaBoost for finite datasets, which in turn can be the starting point for any convergence-rate analysis; (2) understanding the weak-hypothesis class of effective decision stumps generated from data, which we have empirically observed to be significantly smaller than the typically obtained class, as well as its effect on the weak learner’s running time and on previously established improved bounds on the generalization performance of Optimal AdaBoost classifiers; and (3) shedding some light on the “self control” that AdaBoost tends to exhibit in practice.

arXiv:1505.06999v1 [cs.LG] 26 May 2015
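To make the setting in the abstract concrete, the following is a minimal sketch of AdaBoost with an exhaustive ("optimal") weak learner over decision stumps for labels in {-1, +1}. It is an illustration under stated assumptions, not the authors' implementation: the function names (`best_stump`, `stump_predict`, `optimal_adaboost`, `predict`), the choice of candidate thresholds at midpoints between sorted distinct feature values, and the tie-breaking rule (first stump found wins) are all hypothetical choices made here for the example.

```python
import numpy as np

def best_stump(X, y, w):
    """Exhaustive weak learner: return (error, feature, threshold, sign) of the
    decision stump with the lowest weighted error. Assumes w sums to 1."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])  # sorted distinct values of feature j
        # Assumed candidate thresholds: one below the minimum (a constant
        # classifier) plus midpoints between consecutive distinct values.
        thresholds = np.concatenate(
            ([values[0] - 1.0], (values[:-1] + values[1:]) / 2.0)
        )
        for t in thresholds:
            pred = np.where(X[:, j] > t, 1, -1)
            err = np.sum(w[pred != y])
            # Flipping the stump's sign turns weighted error err into 1 - err.
            for sign, e in ((1, err), (-1, 1.0 - err)):
                if e < best[0]:
                    best = (e, j, t, sign)
    return best

def stump_predict(X, j, t, sign):
    """Predict +/-1 with the stump 'sign * (x_j > t ? +1 : -1)'."""
    return sign * np.where(X[:, j] > t, 1, -1)

def optimal_adaboost(X, y, rounds=20):
    """Run AdaBoost where each round picks the minimum-weighted-error stump."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)  # start from uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, j, t, sign = best_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)  # guard against a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)  # exponential-loss step size
        pred = stump_predict(X, j, t, sign)
        w = w * np.exp(-alpha * y * pred)  # up-weight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, j, t, sign))
    return ensemble

def predict(ensemble, X):
    """Sign of the weighted vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(X, j, t, s) for a, j, t, s in ensemble)
    return np.where(score >= 0, 1, -1)
```

For example, on a one-dimensional dataset whose labels follow a +, -, + pattern, no single stump is correct everywhere, but `predict(optimal_adaboost(X, y, rounds=3), X)` can combine three stumps into a classifier that fits the training labels. The exhaustive search costs time proportional to the number of features times the number of candidate thresholds times the number of examples per round, which is one reason the size of the set of distinct stumps induced by the data matters for the weak learner's running time.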
Similar resources
Using AdaBoost and Decision Stumps to Identify Spam E-mail
An existing spam e-mail filter using the Naive Bayes decision engine was retrofitted with one based on the AdaBoost algorithm, using confidence-based weak learners. A comparison of results between the two is presented, with respect to both speed and accuracy.
Support Vector Machines versus Boosting
Support Vector Machines (SVMs) and Adaptive Boosting (AdaBoost) are two successful classification methods. They are essentially the same as they both try to maximize the minimal margin on a training set. In this work, we present an even platform to compare these two learning algorithms in terms of their test error, margin distribution and generalization power. Two basic models of polynomials an...
Boosting recombined weak classifiers
Boosting is a set of methods for the construction of classifier ensembles. The distinguishing feature of these methods is that they allow a strong classifier to be obtained from a combination of weak classifiers. Therefore, it is possible to use boosting methods with very simple base classifiers. Among the simplest classifiers are decision stumps, decision trees with only one decision node. This...
Combining Ordinal Preferences by Boosting
We analyze the relationship between ordinal ranking and binary classification with a new technique called reverse reduction. In particular, we prove that the regret can be transformed between ordinal ranking and binary classification. The proof allows us to establish a general equivalence between the two in terms of hardness. Furthermore, we use the technique to design a novel boosting approach...
Variance Penalizing AdaBoost
This paper proposes a novel boosting algorithm called VadaBoost which is motivated by recent empirical Bernstein bounds. VadaBoost iteratively minimizes a cost function that balances the sample mean and the sample variance of the exponential loss. Each step of the proposed algorithm minimizes the cost efficiently by providing weighted data to a weak learner rather than requiring a brute force e...
Journal title:
- CoRR
Volume abs/1505.06999
Pages -
Publication date 2015